
{
	\AtBeginSection[]{} % Deactivate automatic section start slide for this frame
\section[Summary of part I]{Summary of part I: finite state and action spaces}  
}

\renewcommand{\thefigure}{S-I.\arabic{figure}} 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Outline / table of contents %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{frame}{Table of contents}
	\begin{columns}
		\column{0.5\textwidth}
	  	\begin{varblock}{Summary of part I}
			Reinforcement learning in finite state and action spaces
	   	\end{varblock}
	\column{0.5\textwidth}
		\small
		  \tableofcontents[hideallsubsections, sections={1-7}]
	\end{columns}
\end{frame}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Common Key Ideas to Explored RL Methods %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Key ideas common to all RL methods discussed so far}
\begin{enumerate}
	\item Estimating and comparing value functions
	\item Backing up values along actual or possible state trajectories
	\item Using the generalized policy iteration (GPI) mechanism: maintain an approximate value function and an approximate policy, and continually improve each on the basis of the other
\end{enumerate}
\begin{figure}
	\subfloat{
		\includegraphics[height=4cm]{fig/lec03/GPI_01.pdf}
	}
	\hspace{1cm}
	\subfloat{
		\includegraphics[height=4cm]{fig/lec03/GPI_02.pdf}
	}
\caption{Generalized policy iteration (GPI) as a common building block of all previously discussed RL methods (source: R. Sutton and G. Barto, Reinforcement learning: an introduction, 2018, \href{https://creativecommons.org/licenses/by-nc-nd/2.0/}{CC BY-NC-ND 2.0})}
\end{figure}
}
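
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% GPI in Formulas (Sketch) %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{GPI in formulas: a minimal tabular sketch}
The two interacting GPI processes can be sketched as follows. The notation ($S_t$, $A_t$, $R_{t+1}$, step size $\alpha$, discount factor $\gamma$, estimate $\hat{q}$) is assumed here following Sutton and Barto, and Sarsa is picked only as one representative instance:
\begin{itemize}
	\item Policy evaluation (one-step Sarsa update of the action-value estimate):
	\begin{equation*}
		\hat{q}(S_t, A_t) \leftarrow \hat{q}(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \hat{q}(S_{t+1}, A_{t+1}) - \hat{q}(S_t, A_t) \right]
	\end{equation*}
	\item Policy improvement (greedy with respect to the current estimate; in practice typically $\varepsilon$-greedy to keep exploring):
	\begin{equation*}
		\pi(s) \leftarrow \arg\max_{a} \hat{q}(s, a)
	\end{equation*}
\end{itemize}
Other methods from part I differ mainly in how the evaluation target is formed, e.g., by expected updates using a model or by sampled returns over longer horizons.
}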


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Two Important RL Dimensions: Depth and Width of Updates %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Two important RL dimensions: update depth and width}
\begin{figure}
	\includegraphics[width=7cm]{fig/lec06/Compare_RL_Methods_Update.pdf}
	\caption{A slice through the RL method space (source: R. Sutton and G. Barto, Reinforcement learning: an introduction, 2018, \href{https://creativecommons.org/licenses/by-nc-nd/2.0/}{CC BY-NC-ND 2.0})}
\end{figure}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Other Important RL Dimensions %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Other important RL dimensions}
Selected, non-exhaustive list:
\begin{itemize}
	\item \hl{Problem space}: How many states and actions? Stochastic vs. deterministic environment? Stationary?\pause
	\item \hl{Policy objective}: On-policy vs. off-policy? Explicit vs. implicit policy?\pause
	\item \hl{Task}: Episodic vs. continuing?\pause
	\item \hl{Return definition}: Discounting? General reward design?\pause
	\item \hl{Value}: State vs. action value estimation? \pause
	\item \hl{Model}: Required? Distribution vs. sample models? Learning vs. a priori (expert) knowledge?\pause 
	\item \hl{Exploration}: How to search for new policies?\pause
	\item \hl{Update order}: Synchronous vs. asynchronous? If the latter, in which order?\pause
	\item \hl{Experience}: Simulated vs. real experience? Memory length and style?
	\item \ldots
\end{itemize}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Outlook %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Outlook}
First part of the course:
\begin{block}{Reinforcement learning on small finite action and state spaces}
The problem space is so small that RL methods based on look-up tables are applicable.
\end{block}\pause
\vspace{0.5cm}
Second part of the course:
\begin{block}{Reinforcement learning using function approximators}
The problem space is either continuous or contains an infeasibly large number of discrete state-action pairs. Storing value estimates, models or explicit policies in look-up tables would make the memory demand explode. Modifications and extensions of the available RL algorithms using function approximators are therefore required.
\end{block}
}
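
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Illustrative Table Size Estimate %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Why look-up tables stop scaling: an illustrative estimate}
A rough back-of-the-envelope estimate illustrates the memory argument. All numbers below are assumed purely for illustration:
\begin{itemize}
	\item $n = 6$ continuous state variables, each discretized into $d = 100$ bins: $d^n = 100^6 = 10^{12}$ states
	\item $10$ discrete actions: $10^{12} \cdot 10 = 10^{13}$ state-action pairs
	\item One action value per pair at $4$ bytes each: $4 \cdot 10^{13}\,\text{B} = 40\,\text{TB}$
\end{itemize}
The table size grows exponentially with the number of state variables, which is one motivation for function approximators.
}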

\renewcommand{\thefigure}{\arabic{section}.\arabic{figure}}